Optimized inferences of finite population mean using robust parameters in systematic sampling

In this article, we propose a generalized estimator of the population mean that combines the ratio and regression methods of estimation in the presence of auxiliary information under systematic sampling. We incorporate some robust parameters of the auxiliary variable to obtain precise estimates from the proposed estimator. The mathematical expressions for the bias and mean square error (MSE) of the proposed estimator are derived under a large-sample approximation. Many other generalized ratio and product-type estimators are obtained from the proposed estimator using different choices of scalar constants. Some special cases are also discussed in which the proposed generalized estimator reduces to the usual mean, classical ratio, product, and regression-type estimators. Mathematical conditions are obtained under which the proposed estimator is more precise than the competing estimators considered in this article. The efficiency of the proposed estimator is evaluated using four populations. The results show that the proposed estimator is more efficient than the existing estimators and is useful for survey sampling.


Introduction
In survey sampling, it is well known that no single estimation technique always provides the best results for populations with different characteristics under different situations. In many real-life situations, researchers prefer sampling techniques that provide better estimation results with limited time, cost, and effort. Compared to other sampling designs, systematic sampling is often considered a better choice, as it is easy to apply and can increase the precision of the estimators. In systematic sampling, the first unit is selected at random and the remaining units are then selected according to a fixed rule. Madow and Madow [1] first studied the theory of systematic sampling and considered it the most frequently used probability sampling design for population parameter estimation because of its ease of use. Cochran [2] reviewed the applications of systematic sampling and concluded that "apart from being easy to implement, systematic sampling provides more efficient estimators as compared to the simple random sampling or stratified sampling for certain types of populations". Finney [3] and Zinger [4] discussed the application of systematic sampling to natural populations such as forests. Cochran [5] cited many applications of systematic sampling in forestry, agriculture, and land surveys, and suggested that it can provide implicit stratification and thereby produce better estimates when the sampling frame is in some specific order.
In survey sampling, researchers collect information on variable(s) that are correlated (either positively or negatively) with the main variable of interest. Using such variable(s) along with the main study variable is a very common way to improve the efficiency of the estimator(s) of population parameter(s) such as the mean, total, variance, and proportion. Usually, ratio-type estimators are considered when the correlation between the study and the auxiliary variables is positive, and product-type estimators may be used when it is negative. Moreover, many authors have used conventional and non-conventional parameters of the auxiliary variables to increase the precision of the estimators. Several authors have used systematic sampling when auxiliary information was available along with the variable of interest. Swain [6] suggested a ratio estimator, whereas Shukla [7] proposed the product estimator using systematic sampling. Singh [8] suggested a ratio-cum-product type estimator in the systematic sampling design. Srivastava and Jhajj [9] proposed a class of estimators for the population mean using multi-auxiliary variables, whereas Kushwaha and Singh [10] recommended a family of almost unbiased ratio and product estimators in systematic sampling. Banarasi et al. [11] proposed a family of ratio, product, and difference-type estimators using systematic sampling. Singh and Singh [12] proposed unbiased ratio and product-type estimators in systematic sampling. Recently, some estimators using the exponent of the auxiliary variable were proposed by Singh et al. [13] and Singh and Jatwa [14], among others.
In this article, we propose a generalized mean estimator using the auxiliary information, with the expectation that the proposed estimator will perform better than the competing estimators.
We have used some robust parameters associated with the auxiliary variable, such as the coefficient of kurtosis ($\beta_{2x}$), upper quartile ($Q_{3x}$), mid-range ($M_x$), Hodges-Lehmann estimator ($H_x$), tri-mean ($T_x$), Gini's mean difference ($G_x$), Downton's method ($D_x$), probability-weighted moments ($P_x$), and deciles mean ($DM_x$). The generalized estimator also produces a family of sub-generalized estimators and sub-families of ratio and product-type estimators. In Section 2, we discuss the sampling methodology, notations, and some associated estimators. In Section 3, we derive the expressions for the bias and MSE of the proposed estimator. Many special cases are discussed in which the proposed estimator reduces to ratio and product-type estimators. Mathematical comparisons of the proposed estimator with the existing estimators are given in the same section. An extensive numerical study using two real and two artificial populations is presented in Section 4. A brief discussion of the paper is given in Section 5.
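As an illustration of some of the robust parameters listed above, the following Python sketch computes a few of them for an auxiliary variable. The function name `robust_parameters` and the example data are hypothetical, and the definitions used here are the common textbook ones (quantiles by linear interpolation, Gini's mean difference as the average absolute difference over distinct pairs); the exact definitions in Appendix E may differ.

```python
import numpy as np

def robust_parameters(x):
    """Return a few robust summaries of an auxiliary variable (textbook definitions)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Quartiles by NumPy's default linear interpolation
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    # Hodges-Lehmann: median of the Walsh averages (x_i + x_j) / 2 for i <= j
    i, j = np.triu_indices(n)
    walsh = (x[i] + x[j]) / 2.0
    # Deciles D1, ..., D9
    deciles = np.percentile(x, np.arange(10, 100, 10))
    # Gini's mean difference: mean of |x_i - x_j| over distinct pairs i < j
    iu, ju = np.triu_indices(n, k=1)
    gini = np.mean(np.abs(x[iu] - x[ju]))
    return {
        "upper_quartile": float(q3),
        "mid_range": float((x[0] + x[-1]) / 2.0),
        "hodges_lehmann": float(np.median(walsh)),
        "tri_mean": float((q1 + 2.0 * q2 + q3) / 4.0),
        "gini_mean_difference": float(gini),
        "deciles_mean": float(deciles.mean()),
    }

# Hypothetical data with one extreme value
params = robust_parameters([2, 4, 6, 8, 100])
print(params)
```

In this toy data set the extreme value 100 pulls the mid-range far from the bulk of the data, while the Hodges-Lehmann estimator and the tri-mean stay close to the median, which is the kind of behavior that motivates using robust parameters.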

The methodology of systematic sampling and classical estimators
Consider a population $P = (P_1, P_2, \ldots, P_j, \ldots, P_N)$ composed of N distinct elements arranged in some specific order and labeled from 1 to N, where, for simplicity, the j-th element of P is denoted by $P_j$. Further suppose that N is the product of two positive integers n and k, such that N = nk. Let Y be the main variable of interest and X be an auxiliary variable, whose values are indexed by label as $y = (y_1, y_2, \ldots, y_j, \ldots, y_N)$ and $x = (x_1, x_2, \ldots, x_j, \ldots, x_N)$, respectively. Let $y_{ij}$ and $x_{ij}$ be the values of the study variable and the auxiliary variable on the j-th (j = 1, 2, 3, ..., k) unit in the i-th (i = 1, 2, 3, ..., n) systematic sample. To select a sample of size n, we draw a random number between 1 and k (say j) and then select every k-th unit thereafter, so that the sample consists of the units labeled j, j+k, j+2k, ..., j+(n-1)k. Consequently, there are k possible samples in total, each of size n.
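The selection rule described above can be sketched in Python as follows. This is a minimal illustration that assumes N = nk exactly; the function names `systematic_sample_labels` and `all_systematic_samples` are hypothetical.

```python
import random

def systematic_sample_labels(N, n, start=None, rng=random):
    """Return the 1-based labels j, j+k, ..., j+(n-1)k of one systematic sample."""
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N = n * k for a positive integer k")
    k = N // n
    if start is None:
        start = rng.randint(1, k)  # random start j in {1, ..., k}, both ends inclusive
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    return [start + i * k for i in range(n)]

def all_systematic_samples(N, n):
    """Enumerate the k possible systematic samples, one per random start."""
    k = N // n
    return [systematic_sample_labels(N, n, start=j) for j in range(1, k + 1)]

# With N = 12 and n = 4 we have k = 3 possible samples
samples = all_systematic_samples(N=12, n=4)
for s in samples:
    print(s)
```

Because the random start fully determines the sample, the k samples partition the population: every unit belongs to exactly one of them.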
The means and variances of the study and the auxiliary variables for systematic samples may be obtained as:

The mean estimator (without auxiliary information), along with its variance, is given by:

The ratio and product estimators proposed by Swain (1964) and Shukla (1971) under the systematic sampling design are defined as:

The expressions for the MSE of the estimators $\mu_{r,sys}$ and $\mu_{p,sys}$ are given by:

Here $\rho_y$ and $\rho_x$ are the intra-class correlations, $C_y$ and $C_x$ are the coefficients of variation of the variables in their subscripts, and $\rho_{yx}$ is the correlation coefficient between the study and the auxiliary variables.
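To make the classical estimators concrete, the Python sketch below evaluates the sample mean, the ratio estimator (sample mean of y times the population mean of x divided by the sample mean of x), and the product estimator (sample mean of y times the sample mean of x divided by the population mean of x) on each of the k possible systematic samples, and reports their exact MSEs by averaging squared errors over those k samples. It does not use the first-order MSE expressions; the function name `exact_mse` and the toy population are hypothetical.

```python
import numpy as np

def exact_mse(y, x, n):
    """Exact MSE of the mean, ratio, and product estimators over all k systematic samples."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    N = y.size
    if N % n != 0:
        raise ValueError("this sketch assumes N = n * k")
    k = N // n
    # Reshape to (n, k): column j holds units j, j+k, ..., i.e. one systematic sample
    y_bar = y.reshape(n, k).mean(axis=0)
    x_bar = x.reshape(n, k).mean(axis=0)
    mu_y, mu_x = y.mean(), x.mean()
    estimates = {
        "mean": y_bar,
        "ratio": y_bar * mu_x / x_bar,    # ratio-type estimator
        "product": y_bar * x_bar / mu_x,  # product-type estimator
    }
    # Each of the k samples is equally likely, so the MSE is a plain average
    return {name: float(np.mean((est - mu_y) ** 2)) for name, est in estimates.items()}

# Toy population in which y is exactly proportional to x (perfect positive correlation)
x_pop = np.arange(1, 13)
y_pop = 2.0 * x_pop
mse = exact_mse(y_pop, x_pop, n=4)
print(mse)
```

In this toy population the ratio estimator recovers the population mean on every sample, while the product estimator does worse than the plain sample mean, which matches the usual guidance on positively correlated auxiliary variables.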

Proposed generalized estimator
In this section, we propose a generalized estimator of the population mean that uses an auxiliary variable and combines modified ratio and regression-type estimators under systematic sampling as:

with

where $b_{sys}$ is the regression coefficient between the sample observations of the study variable and the auxiliary variable, and $v$ is a suitable constant that produces ratio-type and product-type estimators by taking the values 1 and -1, respectively. Here $\lambda$ is a constant whose optimum value minimizes the MSE. Further, $\alpha$ ($\alpha \neq 0$) and $\gamma$ may be real numbers or different robust parameters (defined in Appendix E) of the auxiliary variable X. Some notations are necessary to derive the bias and the MSE. Let

Using the above notations, the proposed generalized estimator in (1) may be written as:

Simplifying and expanding the above equation using the Taylor series up to the first order, we have:

After simplification, we apply expectation to the above equation and get:

The mathematical expression for the bias of the proposed estimator is:

We rewrite (2) using the Taylor expansion up to the first order:

$$\hat{\mu}^{(G)}_{pi,sys} - \mu_y \cong \mu_y \xi_y - \left( b_{sys} + \mu_y \lambda v c_j \right) \xi_x .$$

Squaring the above equation and applying expectation, we get:

The first-order MSE of the proposed estimator is:

Optimum choice for scalar "λ"
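To minimize the MSE of the proposed estimator, we differentiate (3) with respect to $\lambda$ and equate the result to zero to get the optimum value of the scalar $\lambda$ as:

The minimum MSE of the proposed estimator, $\mathrm{MSE}_{\min}\big(\hat{\mu}^{(G)}_{pi,sys}\big)$, is given by:

An estimate of $\mathrm{MSE}_{\min}\big(\hat{\mu}^{(G)}_{pi,sys}\big)$ can be obtained by replacing $S_y^2$, $\rho_y^{*}$, $\rho_{yx}$, and $\lambda_{opt}$ with their estimates $s_y^2$, $\hat{\rho}_y^{*}$, $r_{yx}$, and $\hat{\lambda}_{opt}$, respectively.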
Note that the proposed estimator reduces to some classical estimators for particular values of $b_{sys}$, $v$, $\alpha$, $\gamma$, and $\lambda$, as shown in Table 1.

Algebraic comparisons
In this sub-section, we make some algebraic comparisons to obtain the conditions under which the proposed estimator outperforms the competing estimators.
The proposed estimator performs better than the basic mean estimator if

Simulation study
To assess the proposed estimator and its sub-cases, a simulation study is carried out for each population separately. The statistics of all the populations are presented in Appendix B. The mathematical formulae used to compute the absolute biases (ABs) and percentage relative efficiencies (PREs) of all the estimators are defined as:

Source-III:
A bivariate population for (Y, X) is generated in R from a uniform distribution with parameters 2 and 50.

Source-IV:
A bivariate normal population of size N = 1000 (n = 200) is generated in R with a mean vector and variance-covariance matrix:

The following steps were coded in R to obtain the ABs and PREs.
Step 1: From the described populations, we computed the population mean of the auxiliary variable x.
Step 2: We considered different sample sizes for each population and generated samples that follow the systematic pattern, i.e., j, j + k, j + 2k, j + 3k, ..., j + (n - 1)k.
Step 3: For each sample size, the values of the ABs and PREs were computed for all the estimators considered in this paper.
Step 4: The process in Step 2 and Step 3 was repeated 50,000 times, and the outcomes are reported in Table 2.
From the numerical illustrations presented in Table 2, it is noticed that the generalized estimator is more efficient than the competing estimators in terms of ABs and PREs for all populations. Many special cases of ratio-type estimators ($t^{(A)}_{pi,sys}$ and $t^{(F)}_{pi,sys}$) are computed for populations I-III, and product-type estimators ($t^{(B)}_{pi,sys}$ and $t^{(H)}_{pi,sys}$) for population IV (Appendix C). We used v = 1 for populations I-III and v = -1 for population IV, and observed that the ratio-type estimators performed well for the populations with positive correlation, whereas the product-type estimators performed better for the population in which the auxiliary variable is negatively correlated with the study variable. Overall, the results show that the efficiency of the proposed estimator is much higher than that of the existing estimators for all populations.
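The four simulation steps can be sketched in Python (the study itself used R). This sketch covers only the sample mean and the classical ratio and product estimators, not the proposed generalized estimator. It assumes the conventional definitions AB = |average of the estimates - population mean of y| and PRE = 100 x MSE(sample mean) / MSE(estimator); the function name `simulate_ab_pre`, the mean vector, and the covariance matrix are hypothetical placeholders, not the values used in the paper.

```python
import numpy as np

def simulate_ab_pre(y, x, n, reps=50_000, seed=0):
    """Monte Carlo ABs and PREs of the mean, ratio, and product estimators."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    N = y.size
    if N % n != 0:
        raise ValueError("this sketch assumes N = n * k")
    k = N // n
    rng = np.random.default_rng(seed)
    # Step 1: population means (mu_x is treated as known auxiliary information)
    mu_y, mu_x = y.mean(), x.mean()
    # Step 2: random starts (0-based) and the pattern j, j+k, ..., j+(n-1)k
    starts = rng.integers(0, k, size=reps)
    idx = starts[:, None] + k * np.arange(n)[None, :]
    y_bar = y[idx].mean(axis=1)
    x_bar = x[idx].mean(axis=1)
    # Step 3: estimators evaluated on every replicate
    estimates = {
        "mean": y_bar,
        "ratio": y_bar * mu_x / x_bar,
        "product": y_bar * x_bar / mu_x,
    }
    mse = {name: np.mean((est - mu_y) ** 2) for name, est in estimates.items()}
    ab = {name: float(abs(est.mean() - mu_y)) for name, est in estimates.items()}
    pre = {name: float(100.0 * mse["mean"] / mse[name]) for name in estimates}
    return ab, pre

# Hypothetical bivariate normal population of size N = 1000 (assumed parameters)
pop_rng = np.random.default_rng(1)
mean_vec = [50.0, 50.0]
cov = [[25.0, 20.0], [20.0, 25.0]]
pop = pop_rng.multivariate_normal(mean_vec, cov, size=1000)
y_pop, x_pop = pop[:, 0], pop[:, 1]

# Step 4: repeat the sampling many times (5,000 here to keep the demo light)
ab, pre = simulate_ab_pre(y_pop, x_pop, n=200, reps=5_000)
print(ab)
print(pre)
```

Since only k distinct systematic samples exist for a given n, the Monte Carlo averages converge to the exact averages over those k samples as the number of replications grows.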

Conclusion
Systematic sampling is often considered very useful for populations in disciplines such as agriculture, environmental science, ecology, forestry, and marine science. It is also applied outside the aforementioned fields, as it is easy to understand and simple to execute. In the present study, our primary concern is to suggest a generalized estimator of the population mean under systematic sampling. The proposed estimator produces several other estimators as special cases for different choices of its defining parameters. We further suggested several efficient estimators as special cases of the proposed estimator using known parameters associated with the auxiliary variable. The conditions under which the proposed estimator provides more precise estimates than the existing estimators are also derived. We carried out a numerical comparison using two real and two artificial populations. The proposed estimator, along with its sub-estimators, showed greater efficiency than the sample mean, classical ratio, and product estimators for all populations. The proposed estimator performs better than the ratio and product estimators for v = 1 and v = -1, respectively. On the basis of the theoretical comparisons and numerical findings, we suggest that the proposed estimator is efficient and useful for mean estimation under systematic sampling.
Only a single auxiliary variable is considered in this research to propose the generalized estimator for mean estimation. In future work, multiple auxiliary variables could be incorporated to propose a more general estimator in the presence of measurement error and nonresponse under systematic sampling.

Supporting information
S1 Appendix. This file contains multiple tables. (DOCX)